PROGRAM NAME: CLEANTXT
VERSION: 1.0
DATE: 4-24-87
FOR: IBM PC'S OR COMPATIBLES
PROGRAM WRITTEN IN: ASSEMBLY LANGUAGE
AUTHOR: JOHN GASAL
ADDRESS: ANOKA, MN 55303
A: GENERAL DESCRIPTION
CLEANTXT IS A GENERAL PURPOSE DOS FILTER WHOSE NORMAL PURPOSE IS TO CLEAN UP
TEXT FILES BY:
1. ELIMINATING TRAILING BLANKS IN TEXT.
2. REMOVING END OF FILE CHARACTERS (1A HEX) EMBEDDED IN THE
MIDDLE OF TEXT, WHICH MAY PREVENT YOU FROM PROCESSING AN
ENTIRE TEXT FILE. SOMETIMES YOU GET AN EMBEDDED END OF FILE
CHARACTER WHEN YOU APPEND ONE FILE TO ANOTHER. THE END
OF FILE MARKER AT THE END OF THE FILE IS ALSO REMOVED, AS
DOS DOES NOT NEED IT.
3. FIXING 'FUNNY' CARRIAGE RETURNS (CR) AND LINE
FEEDS (LF) SO THAT EVERY LINE FEED IS ALWAYS PRECEDED BY
A CARRIAGE RETURN.
4. EXPANDING TAB MARKS (09 HEX) SO THAT THE PROPER NUMBER OF
SPACES REPLACE THE TAB MARK.
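AS AN ILLUSTRATION ONLY, THE FOUR DEFAULT CLEANUPS ABOVE CAN BE SKETCHED IN
PYTHON. THIS IS NOT THE AUTHOR'S ASSEMBLY CODE. THE FUNCTION NAME clean_text,
THE TAB STOP WIDTH OF 8, AND THE HANDLING OF A CARRIAGE RETURN THAT HAS NO
LINE FEED (IT IS SIMPLY DROPPED) ARE ASSUMPTIONS, AS THIS DOCUMENT DOES NOT
SPECIFY THEM.

```python
def clean_text(data: bytes, tab_width: int = 8) -> bytes:
    """Hypothetical sketch of the four default cleanups on raw file bytes."""
    # Remove every end-of-file character (1A hex), embedded or final.
    data = data.replace(b"\x1a", b"")
    # Drop all carriage returns so CR LF and LF CR look the same;
    # a CR is put back in front of every LF when the lines are joined.
    # Assumption: a CR with no LF nearby is simply discarded.
    data = data.replace(b"\r", b"")
    lines = data.split(b"\n")
    # Expand tabs to spaces first (assumed tab stops every 8 columns),
    # then strip trailing blanks from each line.
    cleaned = [line.expandtabs(tab_width).rstrip(b" ") for line in lines]
    return b"\r\n".join(cleaned)
```

FOR EXAMPLE, clean_text(b"AB  \n\rC\tD\x1a") RETURNS b"AB\r\nC       D".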
OPTIONALLY THE PROGRAM CAN:
1. DELETE FORM FEED CHARACTERS (0C HEX).
2. CHANGE CHARACTERS TO EITHER LOWER OR UPPER CASE.
3. STRIP HIGH BITS, SUCH AS THOSE FOUND IN A FILE
CREATED BY WORDSTAR. THIS APPLIES TO ASCII CHARACTERS
128 THROUGH 255.
4. DELETE THE HIGH ORDER ASCII CHARACTERS (IE, ASCII
CHARACTERS 128 THROUGH 255).
5. LIMIT THE NUMBER OF BLANK LINES BETWEEN ANY BLOCK OF
TEXT TO A PREDETERMINED NUMBER BETWEEN 0 (IE, SINGLE
SPACE) AND 9.
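THE DIFFERENCE BETWEEN STRIPPING HIGH BITS AND DELETING THE HIGH ORDER
CHARACTERS, AND THE BLANK LINE LIMIT, MAY BE EASIER TO SEE IN A SHORT PYTHON
SKETCH. THE FUNCTION NAMES ARE HYPOTHETICAL AND THE CODE ONLY ILLUSTRATES THE
BEHAVIOR DESCRIBED ABOVE. IT IS NOT THE PROGRAM'S ACTUAL IMPLEMENTATION.

```python
def strip_high_bits(data: bytes) -> bytes:
    """Clear bit 7 of every byte, so 128-255 map back onto 0-127."""
    return bytes(b & 0x7F for b in data)


def delete_extended(data: bytes) -> bytes:
    """Drop every byte with a value of 128 or greater."""
    return bytes(b for b in data if b < 128)


def delete_form_feeds(data: bytes) -> bytes:
    """Remove form feed characters (0C hex)."""
    return data.replace(b"\x0c", b"")


def limit_blank_lines(lines, max_blank):
    """Keep at most max_blank consecutive empty lines."""
    out = []
    run = 0  # number of blank lines seen in the current run
    for line in lines:
        if line == b"":
            run += 1
            if run > max_blank:
                continue  # too many blanks in a row, skip this one
        else:
            run = 0
        out.append(line)
    return out
```

WITH THESE DEFINITIONS, strip_high_bits(b"\xc8\xe9") GIVES b"Hi", WHILE
delete_extended(b"\xc8\xe9") GIVES AN EMPTY RESULT.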
B: USE OF THE PROGRAM
THE COMPLETE SYNTAX IS:
CLEANTXT <SOURCE_FILE >TARGET_FILE /H /U /L /F /E /+N
COMMENTS:
-THE SOURCE_FILE MUST ALWAYS BE INCLUDED.
-IF THE TARGET_FILE IS OMITTED, THE OUTPUT IS SENT TO THE
SCREEN.
-THE SWITCHES SHOWN ARE OPTIONAL AND MEAN:
/H STRIP HIGH BITS
/U MAKE ALL CHARACTERS UPPER CASE
/L MAKE ALL CHARACTERS LOWER CASE
/F DELETE FORM FEEDS
/E DELETE EXTENDED CHARACTERS (ASCII 128 OR GREATER)
/+N SET THE MAXIMUM BLANK LINES YOU WANT LEFT BETWEEN TEXT.
N REFERS TO A DIGIT BETWEEN 0 AND 9.
ANY OR ALL SWITCHES CAN BE USED AT ONE TIME IN THE COMMAND,
ALTHOUGH USING CERTAIN ONES IN PAIRS DOESN'T MAKE SENSE (IE, /U
/L TELLS CLEANTXT TO MAKE ALL CHARACTERS BOTH UPPER AND
LOWER CASE AT THE SAME TIME!)
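ONE WAY TO PICTURE HOW THE SWITCHES MAP TO OPTIONS IS THE PYTHON SKETCH
BELOW. THE NAMES FLAGS AND parse_switches, THE OPTION NAMES, AND THE CHOICE TO
ACCEPT SWITCHES IN EITHER CASE ARE ALL ASSUMPTIONS. THIS DOCUMENT DOES NOT SAY
WHAT CLEANTXT DOES WHEN /U AND /L ARE GIVEN TOGETHER, SO THE SKETCH SIMPLY
RECORDS BOTH.

```python
# Hypothetical mapping from the documented switches to option names.
FLAGS = {
    "/H": "strip_high",
    "/U": "upper",
    "/L": "lower",
    "/F": "form_feed",
    "/E": "delete_extended",
}


def parse_switches(args):
    """Turn a list such as ["/U", "/+1", "/F"] into a dictionary of options."""
    opts = {name: False for name in FLAGS.values()}
    opts["max_blank"] = None  # None means no limit on blank lines
    for arg in args:
        key = arg.upper()  # assumption: switches are not case sensitive
        if key in FLAGS:
            opts[FLAGS[key]] = True
        elif len(key) == 3 and key.startswith("/+") and key[2] in "0123456789":
            opts["max_blank"] = int(key[2])  # /+N with N a digit from 0 to 9
        else:
            raise ValueError("unknown switch: " + arg)
    return opts
```

FOR EXAMPLE, parse_switches(["/U", "/+1", "/F"]) TURNS ON THE UPPER CASE AND
FORM FEED OPTIONS AND SETS THE BLANK LINE LIMIT TO 1.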
C: EXAMPLES
CLEANTXT
CLEANTXT TYPED WITH NO PARAMETERS WILL GIVE A HELP
MESSAGE.
CLEANTXT <FILE1.TXT
THIS WILL CLEAN FILE1.TXT AND SEND THE OUTPUT
TO THE SCREEN. NO OPTIONS USED.
CLEANTXT <FILE1.TXT >FILE2.TXT
THIS WILL SEND THE OUTPUT TO FILE2.TXT. NO
OPTIONS USED.
CLEANTXT <FILE1.TXT >FILE2.TXT /U /+1 /F
THIS WILL CLEAN FILE1.TXT, SEND OUTPUT TO
FILE2.TXT. OPTIONS CALLED FOR WILL CONVERT ALL
LETTERS TO UPPER CASE, DELETE FORM FEEDS, AND
ALLOW NO MORE THAN ONE BLANK LINE BETWEEN TEXT.
THIS COMMAND COULD BE USED TO TAKE A FILE FORMATTED
FOR PRINTING AND DELETE THE FORM FEEDS AND
EXTRA BLANK LINES NEAR EACH PAGE BREAK.
CLEANTXT <FILE.COM >FILE.TXT /E /+0
WITH THIS COMMAND YOU CAN LOOK AT THE TEXT IN
ANY 'COM' OR 'EXE' EXECUTABLE FILE. THE
OPTIONS CALLED FOR WILL DELETE ASCII
CHARACTERS 128 OR GREATER AND WILL ONLY ALLOW
SINGLE SPACE OUTPUT. THIS IS A FASTER WAY OF
LOOKING AT TEXT IN ANY EXECUTABLE FILE THAN
USING DEBUG OR A HEX DUMP PROGRAM.
DIR |CLEANTXT >DIR.TXT /L
HERE THE DOS PIPE COMMAND '|' SENDS THE OUTPUT
OF 'DIR' TO 'CLEANTXT' WHICH SENDS ITS OUTPUT INTO
THE FILE DIR.TXT AFTER MAKING IT LOWER CASE.
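THE REDIRECTION AND PIPE EXAMPLES WORK BECAUSE A FILTER ONLY READS ITS
STANDARD INPUT AND WRITES ITS STANDARD OUTPUT. A MINIMAL PYTHON SKETCH OF THAT
STRUCTURE FOLLOWS. THE NAME run_filter IS HYPOTHETICAL, AND IN-MEMORY STREAMS
STAND IN FOR THE PIPE AND THE OUTPUT FILE SO THE EXAMPLE RUNS BY ITSELF.

```python
import io
import sys


def run_filter(src, dst, transform):
    """Read all bytes from src, transform them, and write the result to dst."""
    dst.write(transform(src.read()))


# In-memory streams stand in for the piped input and the redirected output.
piped_in = io.BytesIO(b"COMMAND  COM\r\n")
redirected_out = io.BytesIO()
run_filter(piped_in, redirected_out, bytes.lower)  # same idea as the /L switch
print(redirected_out.getvalue())

# As a real filter, the call would instead be:
# run_filter(sys.stdin.buffer, sys.stdout.buffer, bytes.lower)
```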
D: HISTORY
THE ORIGINAL NEED FOR THIS PROGRAM OCCURRED WHEN I WAS
PREPARING DOCUMENTATION FOR A PROGRAM. I WAS USING A SCREEN
CAPTURE PROGRAM TO DUMP THE IMAGE ON THE CRT TO A
FILE. THE FILE CONTAINING THE SCREEN DUMP COULD NOT BE
DIRECTLY EDITED BY MY WORD PROCESSOR, PC-WRITE, BECAUSE THE
CARRIAGE RETURNS AND LINE FEEDS WERE OUT OF ORDER FROM WHAT
PC-WRITE EXPECTED (PC-WRITE EXPECTS THAT A CARRIAGE RETURN
PRECEDES EVERY LINE FEED). FURTHER, EVERY LINE IN THE SCREEN
DUMP FILE WAS 80 CHARACTERS WIDE. THIS MEANT THAT A BLANK LINE
WAS REPRESENTED BY 80 SPACES (BIG WASTE OF FILE SPACE!).
THUS, I WROTE THE ORIGINAL CLEANTXT PROGRAM TO DELETE THESE
TRAILING SPACES AND FIX UP THE CARRIAGE RETURN-LINE FEED
SEQUENCE. IN AN ACTUAL TEST, THE CLEANTXT PROGRAM REDUCED THE
SIZE OF A TYPICAL SCREEN DUMP FILE FROM 6656 BYTES TO 1478
BYTES (78% REDUCTION).
THE OTHER OPTIONS WERE ADDED LATER.